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ABSTRACT We have developed a new Bayesian image reconstruc- 
tion method that has been shown to be superior to the best implemen- 
tations of other competing methods, including Goodness-of-Fit meth- 
ods such as Least-Squares fitting and Lucy- Richardson reconstruction, 
as well as Maximum Entropy (ME) methods such as those embodied in 
the MEMSYS algorithms. Our new method is based on the concept of 
the pixon, the fundamental, indivisible unit of picture information. Use 
of the pixon concept provides an improved image model, resulting in an 
image prior which is superior to that of standard ME. Our past work has 
shown how uniform information content pixons can be used to develop 
a “Super- ME” method in which entropy is maximized exactly. Recently, 
however, we have developed a superior pixon basis for the image, the 
Fractal Pixon Basis (FPB). Unlike the Uniform Pixon Basis (UPB) of 
our "Super- ME” method, the FPB basis is selected by employing fractal 
dimensional concepts to assess the inherent structure in the image, lhe 
Fractal Pixon Basis results in the best image reconstructions to date, su- 
perior to both UPB and the best ME reconstructions. In this paper, we 
review the theory of the UPB and FPB pixon and apply our methodology 
to the reconstruction of far-infrared imaging of the galaxy M51. The re- 
sults of our reconstruction are compared to published reconstructions of 
the same data using the Lucy- Richardson algorithm, the Maximum Cor- 
relation Method developed at 1PAC, and the MEMSYS ME algorithms. 
The results show that our reconstructed image has a spatial resolution a 
factor of two better than best previous methods (and a factor of 20 finer 
than the width of the point response function), and detects sources two 
orders of magnitude fainter than other methods. 


BAYESIAN IMAGE RECONSTRUCTION 

Bayesian image reconstruction estimates the best image by statistically modeling 
the imaging process. To do this, one lactors the joint probability distribution 
of the triplet, D , I, and M, i.e., p(D,I,M), (where D , 1 , and M arc the data, 
unblurred image, and model respectively) using Bayes’ Theorem to derive an 
expression for the most probable or M.A.P. (Maximum A Posteriori) image. In 
deriving the expression for the M.A.P. image, most Bayesian methods assume 
that all aspects of the model, M , linking image and the data are to be held 
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constant during the image reconstruction process. More advanced methods, 
such as Weir’s multi-channel method (1991, 1993) and the method presented here 
(also see Pina and Puetter 1993 arid Puetter and Pina 1993), systematically vary 
certain aspects of the model to improve the quality of the image reconstruction. 
Varying the image and the model simultaneously, places I and M on a more equal 
footing and begins to blur the distinction between image and model. Indeed, 
our Bayesian formulation of the image reconstruction problem treats them as 
full equals; see Pina and Puetter (1993) and Puetter and Pina (1993). With this 
in mind, a Bayesian formulation of the image reconstruction problem which is 
symmetric with respect to the image and model can be obtained by factoring 
p(D, I, M) in the following manner: 

P(V-. T M) = p(D\I, M)p{I , M) = p(/, M\D)p(D) (1) 

which implies 


1>{1, M\D) 


p( D\l, M)p(l. M) p(D\I , M)p(I\ M )p( M ) 
p{D) ~ p{D) 


oc p(D\l , M )p(I\M) 


where in the standard notation p(W|E) is the probability of X given that Y is 
known, and the proportionality in equation (2) is obtained since D is constant 
(hence p(D) is constant) and since we have no prior basis on which to dis- 
criminate between models (i.e., p(M) - const). Equation (2) is an expression 
for the M.A.P. image/model pair, i.e., the image/model pair which maximizes 
p( /. M\D). Our met hod assumes that deriving the M.A.P. image/ model pair 
represents the optimal reconstruction. 


The significance of the two terms in the proportionality in equation (2) 
is readily understood. The first term, p(D\I,M), is a goodness-of-fit (GOF) 
quantity, measuring the likelihood of the data given a particular image and 
model. The second term, p(/|A/), the “image prior'’, expresses the a priori 
probability of a particular realization of the image; note that equivalently we 
could use the image/model prior p(/, M). In GOF image reconstruction, p{l\M) 
is eflectively set equal to unity, i.e., there is no prior bias concerning the image. 
A typical choice for p(D\l,M) is to use p{D\l,M) = exp(— y \ 2 /2), i.e., the 
standard chi-square distribution. This approach ensures a faithful rendition 
of the data, but typically results in images with spurious low signal- to- noise 
features. In Maximum Entropy (ME) image reconstruction, the image prior 
is based upon phase space volume” or counting arguments and the prior is 
expressed as p(I\M) = exp(a.S’), where S is the entropy of the image and a is a 
Lagrange multiplier that adjusts the relative importance of the GOF and image 
prior. 


PIXONS AND A NEW IMAGE PRIOR 

By varying the model simultaneously with the image during the reconstruction 
process, the solution to the reconstruction process is optimized over a signif- 
icantly larger solution space. This solution space contains, for example, the 
solutions spaces of the standard GOF and ME methods. Consequently, the re- 
sults of our methods theoretically should be no worse than those of competing 
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methods. While the advantage of varying the model seem clear, it is less obvi- 
ous how to do this in a productive manner. Certain aspects of the model, for 
example, should not be varied. The model includes, for instance, a description 
of the physics of the imaging process, e.g., that the data is the convolution of 
the point-spread-function (PSF) arid the blur- free underlying image. The model 
also includes the description of the noise processes, e.g., the noise might have 
a Gaussian or Poisson distribution. These aspects of the model should not be 
varied if ones wishes to remain faithful to the truth. However, the mathematical 
model or representation of the image can be varied. Furthermore, varying the 
mathematical model of the image can substantially improve the quality of the 
image reconstruction. One relatively successful approach to image modeling was 
suggested by Pina and Puetter 1993 and Puetter and Pina 1993. They pointed 
out that in the most general of terms, an image is a collection of distinguishable 
events which occur in distinct cells. Hence the value for the image prior can be 
determined from simple counting arguments. If there are JV; events in cell i, and 
a total of n cells, then the prior probability of that particular image is: 

p({Ni},n,N)= =P(I\M) , (3) 

where { A; ) is the set of all numbers of events in cells i, and N is the total 
number of events, i.e., N — N’,. The cells used in equation (3) are quite 

general. In fact, we have not specified a size, shape, or position for the cells. 
The goal of selecting cells for a specific image is to maximize p(I\M) subject 
to the constraint that an adequate GOF is maintained. Equation (3) clearly 
indicates the desirable properties of the model, i.e., the model should contain 
the fewest number of cells with each containing the largest number of events. 
We have termed these generalized cells pixons. This terminology is not simply 
whimsical. An image’s pixons represent the smallest number of cells (of arbitrary 
shape, position, etc.) required to fit the data and represent the minimum degrees 
of freedom necessary to specify the image within the accuracy of the noise. If 
properly selected, this set is not reducible to a smaller set. In this sense, the 
pixons are the fundamental particles of information in the image. In fact, using 
a pixon basis is the fulfillment of Occam’s Razor formalized in Bayesian terms. 
Intuitively, this is the goal of every modeling effort, i.e., to fit the data with the 
simplest model. Equation (2) is the embodiment of that goal with the p(D\l, M ) 
term insisting on a good fit and p{I\M) acting for the cause of simple models, 
i.e., acting as an Occam’s Razor term. 

While the goal of pixon-based image reconstruction is now clear, there is 
still considerable uncertainty associated with how to actually select the pixon 
basis. Pina and Puetter (1993) suggested a particular formulation: The Uniform 
Pixon Basis (UPB). In the UPB, each of the pixons contain the same amount of 
information, i.e., iV, is identical for each pixon. Because of this, the UPB image 
representation provides a sort of “Super-ME” reconstruction. This is because 
in the UPB representation each pixon is identical, i.e., in this basis the image is 
exactly flat. Hence entropy is maximized exactly. All of the image’s structural 
information is inherent in the pixon basis. Because entropy is maximized exactly, 
and because the number of cells in the UPB representation is typically much 
smaller that the number of data pixels used in standard ME reconstructions, 
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the formal value of the image prior and hence the value of p(I, M\D) is vastly 
improved. Crudely speaking, using the UPB representation is the mathematical 
equivalent of imaging the sky with a magical CCD which can interactively change 
its number of pixels, their positions, sizes, and shapes in a manner in which each 
pixel collects the same number of photons. 

A FRACTAL PIXON BASIS 

While the UPB image representation is a major advance, the UPB basis is rather 
ad hoc. It can not be justified as an optimal pixon basis. A more satisfying 
pixon basis can be chosen using fractal concepts as we shall now describe- also 
see Puctter and Pina 1993. 

There are numerous definitions for the fractal dimension of a geometric 
object. However all definitions have one thing in common. Each calculates a 
quantity that changes as a scale (or measurement precision) changes. For exam- 
ple, the compass dimension (also referred to as the divider or ruler dimension) 
of a line in a plane surface is defined in terms of how the measured length of the 
line varies as the length of the ruler used to measure the length varies. These 
ideas are closely related to pixons and the image reconstruction problem since 
choice of a pixon basis is essentially asking the question: “How does the GOT 
change as the size of the pixons are varied?” 

In forming their UPB image model, Pina and Puetter (1993) use a pseudo- 
image calculated on a pseudo-grid (usually equal to or finer than the data pixel 
grid). At each point of the pseudo-grid a local scale is determined. This is the 
local pixon size. 1 he image is then set equal to the pseudo-image convolved 
with the pixon kernel (or shape) function of the local width. This image is then 
convolved with the PSF and compared to the data to determine the GOT. Using 
the above prescription, the pixons are not cells with sharp boundaries, but are 
“fuzzy”. However, since the image values on the pseudo-grid are correlated, the 
number of degrees of freedom in the image are significantly less than the number 
of pseudo-pixels. 

We are now ready to consider how to improve the UPB representation using 
fractal concepts. Occam s Razor tells us that we must use the fewest number of 
pixons each containing the largest information content. The GOF term tells us 
we must fit the data as well as possible. Using fuzzy pixons, we know that at each 
point in the image we are going to smooth a pseudo-image over a local scale. 
What aspect of the smoothing process will prevent an adequate GOF value? 
Clearly, if the underlying image is locally smooth on a given scale, smoothing 
the pseudo-image on scales less than or equal to this scale loses nothing. The 
GOF term is unaffected and the image prior improves. On the other hand, 
smoothing on scales larger than the local scale is detrimental. So how does 
one tell if the pseudo-image is being smoothed too much? If the true image 
is smoothed with pixons of very small width, the value of the smoothed image 
changes inappreciably. As the widths of the pixons are increased, deviations of 
the smoothed value from the unsmoothed value become appreciable when the 
pixon size exceeds the local smoothness scale of the image. This provides an 
over-smoothing signal. The only remaining issue is determining what size signal 
is significant. We know, however, by what tolerance we can allow the smoothed 
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image to depart from the unsmoothed image. The departure must not exceed 
the local noise. Hence the determination of the local pixon size is an inverse 
fractal dimension problem. What we arc seeking is the largest local smoothing 
scale consistent with the tolerance set by the noise. 

RESULTS: 60 MICRON IRAS IMAGING OF M51 

As a test of the ability of FPB-based image reconstruction we have reconstructed 
an image from 60/rm IRAS survey scans of the interacting galaxy pair M51. This 
data was selected for several reasons. First, M51 is a well studied object at op- 
tical, IR, and radio wavelengths. Hence “reality” for this galaxy is well known. 
Second, this particular data set was chosen as the basis of an image reconstruc- 
tion contest at the 1990 MaxEnt Workshop in Laramie, Wyoming (see Bontekoe 
1991). Consequently, there have been a number of serious attempts at perform- 
ing image reconstruction on this data set by specialists in the field, finally, 
the IRAS data for this object is particularly strenuous for image reconstruction 
methods. This is because first, all the interesting structure is on sub-pixel scales 
(IRAS employed relatively large, discrete detectors (1.5 ' by 4.75' at 60// m) and 
the position of M51 in the sky caused all scan directions to be nearly parallel. 
This means that reconstructions in the cross-scan direction (i.e., the 4.75' direc- 
tion along the detector length) should be significantly more difficult than in the 
scan direction. In addition, the point source response of the 15 IRAS 60 urn 
detectors (pixel angular response) is known oidy to roughly 10% accuracy, and 
finally, the data arc not an image, but incompletely and irregularly sampled 
scans of individual detectors across the galaxy. 

Our FPB reconstruction appears in Figure 1 along with Lucy- Richardson 
and Maximum Correlation Method (MCM) reconstructions (Rice 1993) and a 
MEMSYS 3 reconstruction (Bontekoe tt al. 1991)-sce Gull (1991) for a de- 
scription of the MEMSYS algorithms. The winning entry to the MaxEnt 90 
image reconstruction contest was produced by Nick Weir of Caltech and is not 
presented here since quantitative information concerning this solution has not 
been published— see Bontekoe (1991) for a gray-scale picture of this reconstruc- 
tion. However, Weir’s solution is qualitatively similar to Bontekoe’s solution 
(Weir 1993). Both were made with MEMSYS 3. Weir’s solution, however, used 
a single correlation length channel in the reconstruction. This constrained the 
minimum correlation length of features in the reconstruction, preventing break- 
up of the image on smaller size scales. This is probably what resulted in the 
winning edge for Weir’s reconstruction in the MaxEnt 90 contest (Weir 1993). 

As can be seen from Figure 1, our FPB-based reconstruction is clearly su- 
perior to those produced by other methods. The Lucy- Richardson and MCM 
reconstructions fail to significantly reduce image spread in the cross-scan direc- 
tion, i.e., the rectangular signature of the 1.5 by 4.75 aremin detectors is still 
clearly evident, and fail to reconstruct even gross features such as the “hole” 
(black region) in the emission north ol the nucleus— this hole is clearly evident in 
optical images of M51. The MEMSYS 3 reconstruction by Bontekoe is signifi- 
cantly better. This image clearly recovers the emission “hole” and resolves the 
NE and SW arms into discrete sources. Nonetheless, the level ol detail present 
in the FBP reconstruction is clearly absent, e.g., the weak source centered in 
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FIGURE 1 Imago reconstructions of the Interacting galaxy M51. (a) FPU- 
based reconstruction; (b) MEMSYS 3 Reconstruction; (c) Lucy- Richardson Fig- 
ure of panel (b) reproduced from Bontekoe (1991), by permission of the authors. 
Figures ol panels (c) and (d) reproduced from Rice 1993 by permission of the 
author. 


the emission hole (again, this feature corresponds to a known optical source), 
and the fainter sources around the periphery of the image (most of which are 
known radio or optical sources). One also notes that the resolution of the FPB 
reconstruction is roughly a factor of two greater than the MEMSYS 3 recon- 
struction. However, to be fair to the MEMSYS 3 reconstruction, the authors 
rebinned their reconstruction by a factor of two to eliminate any possible spuri- 
ous features. However, we are quite confident that all of the features present in 
our reconstruction are real. 

Aside from the fact that most of the sources can be identified with emis- 
sion at other wavelengths, the residual errors in our reconstruction are much 
smaller than in the MEMSYS 3 reconstruction. As pointed out by Bontekoe 
ft at - (1991) the peak flux in the MEMSYS 3 reconstruction is 2650 units. 
The residual errors are correlated with the signal and lie between 0 and 430 
units. By contrast, the peak value in the FPB reconstruction is 3290 units, 
the residuals are uncorrelated with the signal, and the residuals lie between -9 
and 1/ units. (I he contour levels for the MEMSYS 3 and FPB reconstructions 
are identical and are 150, 300, 600, 1200, and 2400 units.) Furthermore, the 
large deviation residuals in the FPB reconstruction are due to systematic errors 
involving incomplete scan coverage of M51. As mentioned above, these errors 
do not lie under the significant flux emitting portions of the M51 image. The 
residual errors associated with emitting regions in M51 are significantly smaller 
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and show a roughly Gaussian distribution. Consequently, the residuals in the 
FPB reconstruction represent a two order- of- magnitude improvement over the 
MEMSYS 3 residuals. This, of course, immediately explains the absence of the 
fine features in the MEMSYS 3 reconstruction. The weak sources present in the 
FPB reconstruction all lie in the 150 to 300 unit range, and hence are 30 or 
greater detections in the 1* PB reconstruction. By contrast, these same features 
would be less than one detection in the MEMSYS 3 reconstruction. 
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